Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse command line arguments as a std::string #330

Closed
wants to merge 5 commits into from

Conversation

huangzonghao
Copy link

@huangzonghao huangzonghao commented May 1, 2022

A straightforward implementation for #329. Works fine for enabling local program to be called from browser via system protocols. Known limitation includes no bash variable and wildcard expansion. Haven't tested for corner cases.


This change is Reviewable

This is the simplest implementation of the functionality described in
the short description. This patch only deals with  the arguments seperated by
whitespaces, and process them as is, without any shell processing like
expanding wildcards or replacing variables.
@eyalroz
Copy link
Contributor

eyalroz commented May 1, 2022

The API here should probably not be an std::string, but rather a string view. Unfortunately, this is not available before C++17, so - a const *char and length, i.e. const char* argv, std::size_t argv_length; or just const char* argv if it's a '\0'-terminated string. Once that is in, you could add a convenience method taking a const std::string&. I would avoid the unnecessary copying.

src/example.cpp Outdated
@@ -185,5 +198,8 @@ int main(int argc, const char* argv[])
{
parse(argc, argv);

// Parse argv as string
parse(argc, argv, true);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think adding all of this to the example is necessary. I would rather just have a test case that calls the new function.

@@ -1772,6 +1772,9 @@ namespace cxxopts
ParseResult
parse(int argc, const char* const* argv);

ParseResult
parse(std::string argv_str);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should take a constref.

tokens.push_back(token);
}

const char** argv = new const char*[tokens.size()];
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use a smart pointer here?

@huangzonghao

This comment was marked as resolved.

Copy link
Contributor

@eyalroz eyalroz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is too much redundant copying; this code needs more work.

ParseResult
Options::parse(const std::string& argv_str)
{
std::vector<std::string> tokens;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why create all of these copies?

@@ -75,6 +75,16 @@ To parse the command line do:
auto result = options.parse(argc, argv);
```

Or if you have all the arguments of `argv` in a `std::string`, separated by
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I'm happy with you forcing the use of std::string. What if I have a char*? I can't wrap it in a string, so you're forcing me to string-copy.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a legit concern, though it goes beyond the scope of the problem this patch is trying to target at. Also unlike the single string case, I cannot think of a use case where you are given all the arguments in a single char*. I don't see a trivial fix for this to avoid the extra copy of the whole string, but I am happy to learn more on it and address it when possible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@huangzonghao : If you use a class/struct which can be implicitly constructed both from an std::string and from a char* - then you won't need the copy. That class is string_view... see my comment below.

tokens.push_back(token);
}

std::unique_ptr<const char*[]> argv(new const char*[tokens.size()]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, you're going through a bunch of hoops to create strings, which are copies of the original argv string, which itself may be a copy of something else - and now you're creating copies again?

Look, none of this copying needs to happen. Use either a string_view, or your own struct with a pointer and a length, or a pair of pointers.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string_view breaks the requirement of this project, and could potentially break the other projects that depend on this one. But you are right, we should be able to generate argv pointers based on the original string copy. I'll look into this later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@huangzonghao : You're right, of course. By "or" I meant "or, actually, you could use"...

Also, one could use string_view lite, which is C++11-compatible. That's essentially the same thing but with more code... I've been known to implement "poor man's string_view" or "poor man's span" in some of my C++11 projects.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer not to bring in other libraries, and at the moment I still want to keep it C++11 compatible. However I think there is probably a nicer way to do this. I think you could probably just copy the whole string into a new buffer, replace all spaces with a zero, and store the pointers as you go in an argv vector.

@@ -21,6 +21,8 @@ class Argv {
m_args.push_back(std::move(ptr));
m_argv.get()[i] = m_args.back().get();

m_argv_str += std::string(m_argv.get()[i]) + std::string(" ");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't burden the non-single-string code with this string copying.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nearly all tests made for the (argc, argv) pair could be used for this single string code, unless we later on decide to design a separate set of tests for single string code and put them into a different cpp file. I would suggest we put the tests for (argc, argv) pair and single string together, so that whenever a new test is added, both functions can be tested.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this is not necessary. You really only need to test that the string splitting works.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry, I take it back! I didn't notice this is was in one of the tests... :-(

CHECK_THROWS_AS(result["nothing"].as<std::string>(), cxxopts::option_has_no_value_exception&);

auto str_result = options.parse(argv.argv_str());
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think duplicating all of the test cases is a good idea. If you really wanted to test both you might be able to use a catch SECTION to do it without duplication. But then I would be ok with factoring out the string splitting into its own class and just testing that splitting the strings up gives the correct results.

@jarro2783 jarro2783 closed this Jul 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants